2,158 research outputs found

    Efficient genotype compression and analysis of large genetic-variation data sets

    Get PDF
    Genotype Query Tools (GQT) is an indexing strategy that expedites analyses of genome-variation data sets in Variant Call Format based on sample genotypes, phenotypes and relationships. GQT's compressed genotype index minimizes decompression for analysis, and its performance relative to that of existing methods improves with cohort size. We show substantial (up to 443-fold) gains in performance over existing methods and demonstrate GQT's utility for exploring massive data sets involving thousands to millions of genomes. GQT can be accessed at https://github.com/ryanlayer/gqt

    Real-time selective sequencing using nanopore technology

    Get PDF
    The Oxford Nanopore Technologies MinION sequencer enables the selection of specific DNA molecules for sequencing by reversing the driving voltage across individual nanopores. To directly select molecules for sequencing, we used dynamic time warping to match reads to reference sequences. We demonstrate our open-source Read Until software in real-time selective sequencing of regions within small genomes, individual amplicon enrichment and normalization of an amplicon set

    Singularities in the optical response of cuprates

    Full text link
    We argue that the detailed analysis of the optical response in cuprate superconductors allows one to verify the magnetic scenario of superconductivity in cuprates, as for strong coupling charge carriers to antiferromagnetic spin fluctuations, the second derivative of optical conductivity should contain detectable singularities at 2Δ+Δspin2\Delta +\Delta_{\rm spin}, 4Δ4\Delta, and 2Δ+2Δspin2\Delta+2\Delta_{\rm spin}, where Δ\Delta is the amplitude of the superconducting gap, and Δs\Delta_{s} is the resonance energy of spin fluctuations measured in neutron scattering. We argue that there is a good chance that these singularities have already been detected in the experiments on optimally doped YBCOYBCO.Comment: 6 pages, 4 figure

    Deeply sequenced metagenome and metatranscriptome of a biogas-producing microbial community from an agricultural production-scale biogas plant

    Get PDF
    Bremges A, Maus I, Belmann P, et al. Deeply sequenced metagenome and metatranscriptome of a biogas-producing microbial community from an agricultural production-scale biogas plant. GigaScience. 2015;4(1): 33.Background The production of biogas takes place under anaerobic conditions and involves microbial decomposition of organic matter. Most of the participating microbes are still unknown and non-cultivable. Accordingly, shotgun metagenome sequencing currently is the method of choice to obtain insights into community composition and the genetic repertoire. Findings Here, we report on the deeply sequenced metagenome and metatranscriptome of a complex biogas-producing microbial community from an agricultural production-scale biogas plant. We assembled the metagenome and, as an example application, show that we reconstructed most genes involved in the methane metabolism, a key pathway involving methanogenesis performed by methanogenic Archaea. This result indicates that there is sufficient sequencing coverage for most downstream analyses. Conclusions Sequenced at least one order of magnitude deeper than previous studies, our metagenome data will enable new insights into community composition and the genetic potential of important community members. Moreover, mapping of transcripts to reconstructed genome sequences will enable the identification of active metabolic pathways in target organisms

    Cancer cells exploit an orphan RNA to drive metastatic progression.

    Get PDF
    Here we performed a systematic search to identify breast-cancer-specific small noncoding RNAs, which we have collectively termed orphan noncoding RNAs (oncRNAs). We subsequently discovered that one of these oncRNAs, which originates from the 3' end of TERC, acts as a regulator of gene expression and is a robust promoter of breast cancer metastasis. This oncRNA, which we have named T3p, exerts its prometastatic effects by acting as an inhibitor of RISC complex activity and increasing the expression of the prometastatic genes NUPR1 and PANX2. Furthermore, we have shown that oncRNAs are present in cancer-cell-derived extracellular vesicles, raising the possibility that these circulating oncRNAs may also have a role in non-cell autonomous disease pathogenesis. Additionally, these circulating oncRNAs present a novel avenue for cancer fingerprinting using liquid biopsies

    Plant-RRBS, a bisulfite and next-generation sequencing-based methylome profiling method enriching for coverage of cytosine positions

    Get PDF
    Background: Cytosine methylation in plant genomes is important for the regulation of gene transcription and transposon activity. Genome-wide methylomes are studied upon mutation of the DNA methyltransferases, adaptation to environmental stresses or during development. However, from basic biology to breeding programs, there is a need to monitor multiple samples to determine transgenerational methylation inheritance or differential cytosine methylation. Methylome data obtained by sodium hydrogen sulfite (bisulfite)-conversion and next-generation sequencing (NGS) provide genome- wide information on cytosine methylation. However, a profiling method that detects cytosine methylation state dispersed over the genome would allow high-throughput analysis of multiple plant samples with distinct epigenetic signatures. We use specific restriction endonucleases to enrich for cytosine coverage in a bisulfite and NGS-based profiling method, which was compared to whole-genome bisulfite sequencing of the same plant material. Methods: We established an effective methylome profiling method in plants, termed plant-reduced representation bisulfite sequencing (plant-RRBS), using optimized double restriction endonuclease digestion, fragment end repair, adapter ligation, followed by bisulfite conversion, PCR amplification and NGS. We report a performant laboratory protocol and a straightforward bioinformatics data analysis pipeline for plant-RRBS, applicable for any reference-sequenced plant species. Results: As a proof of concept, methylome profiling was performed using an Oryza sativa ssp. indica pure breeding line and a derived epigenetically altered line (epiline). Plant-RRBS detects methylation levels at tens of millions of cytosine positions deduced from bisulfite conversion in multiple samples. To evaluate the method, the coverage of cytosine positions, the intra-line similarity and the differential cytosine methylation levels between the pure breeding line and the epiline were determined. Plant-RRBS reproducibly covers commonly up to one fourth of the cytosine positions in the rice genome when using MspI-DpnII within a group of five biological replicates of a line. The method predominantly detects cytosine methylation in putative promoter regions and not-annotated regions in rice. Conclusions: Plant-RRBS offers high-throughput and broad, genome- dispersed methylation detection by effective read number generation obtained from reproducibly covered genome fractions using optimized endonuclease combinations, facilitating comparative analyses of multi-sample studies for cytosine methylation and transgenerational stability in experimental material and plant breeding populations

    Predicting cell types and genetic variations contributing to disease by combining GWAS and epigenetic data

    Get PDF
    Genome-wide association studies (GWASs) identify single nucleotide polymorphisms (SNPs) that are enriched in individuals suffering from a given disease. Most disease-associated SNPs fall into non-coding regions, so that it is not straightforward to infer phenotype or function; moreover, many SNPs are in tight genetic linkage, so that a SNP identified as associated with a particular disease may not itself be causal, but rather signify the presence of a linked SNP that is functionally relevant to disease pathogenesis. Here, we present an analysis method that takes advantage of the recent rapid accumulation of epigenomics data to address these problems for some SNPs. Using asthma as a prototypic example; we show that non-coding disease-associated SNPs are enriched in genomic regions that function as regulators of transcription, such as enhancers and promoters. Identifying enhancers based on the presence of the histone modification marks such as H3K4me1 in different cell types, we show that the location of enhancers is highly cell-type specific. We use these findings to predict which SNPs are likely to be directly contributing to disease based on their presence in regulatory regions, and in which cell types their effect is expected to be detectable. Moreover, we can also predict which cell types contribute to a disease based on overlap of the disease-associated SNPs with the locations of enhancers present in a given cell type. Finally, we suggest that it will be possible to re-analyze GWAS studies with much higher power by limiting the SNPs considered to those in coding or regulatory regions of cell types relevant to a given disease

    A novel class of microRNA-recognition elements that function only within open reading frames.

    Get PDF
    MicroRNAs (miRNAs) are well known to target 3' untranslated regions (3' UTRs) in mRNAs, thereby silencing gene expression at the post-transcriptional level. Multiple reports have also indicated the ability of miRNAs to target protein-coding sequences (CDS); however, miRNAs have been generally believed to function through similar mechanisms regardless of the locations of their sites of action. Here, we report a class of miRNA-recognition elements (MREs) that function exclusively in CDS regions. Through functional and mechanistic characterization of these 'unusual' MREs, we demonstrate that CDS-targeted miRNAs require extensive base-pairing at the 3' side rather than the 5' seed; cause gene silencing in an Argonaute-dependent but GW182-independent manner; and repress translation by inducing transient ribosome stalling instead of mRNA destabilization. These findings reveal distinct mechanisms and functional consequences of miRNAs that target CDS versus the 3' UTR and suggest that CDS-targeted miRNAs may use a translational quality-control-related mechanism to regulate translation in mammalian cells

    Plasmodium knowlesi Genome Sequences from Clinical Isolates Reveal Extensive Genomic Dimorphism.

    Get PDF
    Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology
    • …
    corecore